20 research outputs found
Open-architecture Implementation of Fragment Molecular Orbital Method for Peta-scale Computing
We present our perspective and goals on high-performance computing for
nanoscience in accordance with the global trend toward "peta-scale computing."
After reviewing our results obtained through the grid-enabled version of the
fragment molecular orbital method (FMO) on the grid testbed by the Japanese
Grid Project, National Research Grid Initiative (NAREGI), we show that FMO is
one of the best candidates for peta-scale applications by predicting its
effective performance in peta-scale computers. Finally, we introduce our new
project constructing a peta-scale application in an open-architecture
implementation of FMO in order to realize both goals of high performance in
peta-scale computers and extendibility to multiphysics simulations.
Comment: 6 pages, 9 figures, proceedings of the 2nd IEEE/ACM international
workshop on high performance computing for nano-science and technology
(HPCNano06)
Implementation and Evaluation of Fock Matrix Calculation Program on the Cell Processor
ICCMSE 2007 : 25-30 September 2007 : Greece
Many processor architectures have been proposed to date, and performance has improved remarkably. Recently, chip multiprocessors (CMPs), which place many processor cores on a single chip, have been proposed for further performance improvement. The Cell processor is one such CMP and shows high computational performance. Although this processor was designed for multimedia workloads, its high performance can also be applied to molecular orbital calculations. In this study we implemented a Fock matrix construction program on the Cell processor and evaluated its computational performance. We found two main kinds of stalls, caused by branch prediction and by data alignment, both of which are handled by software mechanisms in order to simplify the Cell processor hardware. Performance could be improved by about 30% if the branch prediction hit ratio were raised to 99%. As for data alignment, the part of the stalls that originates in the data shuffle pipeline could be reduced by providing a hardware data alignment mechanism.
Implementation of Parallel Fragment Molecular Orbital Calculation Program Using One-Sided Communication Functionality
Our aim is to create source code that allows quantum chemistry calculations based on the fragment molecular orbital (FMO) method to run efficiently on massively parallel computers with tens of thousands to hundreds of thousands of processors. To store the large volume of data used in the calculation in a distributed manner and to access it efficiently, we write the code using the one-sided communication mechanism implemented in MPI-2. Benchmark results obtained with the code under development and with performance evaluation tools developed in the Peta-scale Interconnect Technology Development Project, together with estimates of the communication cost, confirmed that the parallel FMO code with distributed data storage runs efficiently on massively parallel computers in terms of communication cost.
The fragment molecular orbital (FMO) method is one of the promising techniques for calculating the electronic structure of large molecules such as proteins and DNA, and it is well suited to parallel processing. For FMO programs to run effectively on massively parallel computers, it is desirable to store the density matrix data that appears in FMO calculations in a distributed manner. We have been implementing an FMO program with distributed density matrix storage using the one-sided communication functionality of the MPI-2 standard. Benchmark tests using the intermediate code showed that the FMO program with distributed density matrix storage is more effective than one with non-distributed storage in massively parallel execution.
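The idea of distributed density matrix storage can be illustrated with a small bookkeeping sketch. The abstract does not describe the actual data layout, so everything here is an assumption made for illustration: fragments are assigned to ranks round-robin, each fragment stores a full square density matrix, and the names `build_layout` and `locate` are hypothetical. The sketch only computes where a fragment's matrix would live; in a real MPI-2 program the resulting rank, displacement, and element count would be the kind of values passed as the target arguments of a one-sided `MPI_Get`.

```python
def build_layout(fragment_sizes, n_ranks):
    """Return (layout, window_sizes) for a round-robin distribution.

    fragment_sizes: number of basis functions per fragment (hypothetical input).
    layout[i] is (owner_rank, displacement, count) for fragment i, where
    displacement and count are measured in matrix elements.
    """
    if n_ranks < 1:
        raise ValueError("n_ranks must be at least 1")
    next_free = [0] * n_ranks  # running offset inside each rank's memory window
    layout = []
    for i, nbasis in enumerate(fragment_sizes):
        rank = i % n_ranks          # assumption: round-robin ownership
        count = nbasis * nbasis     # assumption: full square matrix is stored
        layout.append((rank, next_free[rank], count))
        next_free[rank] += count
    return layout, next_free


def locate(layout, fragment):
    """Look up which rank holds a fragment's density matrix, and where."""
    rank, displacement, count = layout[fragment]
    return {"target_rank": rank, "target_disp": displacement, "count": count}


if __name__ == "__main__":
    layout, window_sizes = build_layout([2, 3, 4, 5], n_ranks=2)
    for frag in range(4):
        print(frag, locate(layout, frag))
    print("elements per rank:", window_sizes)
```

With a table like this, any process can work out the location of a remote fragment's data locally, so the owning process does not need to take part in the transfer. That is the property that makes one-sided communication a natural fit for distributed storage.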